Cluster Analysis, Model Selection, and Prior Distributions on Models
Abstract
Clustering is an important and challenging statistical problem with an extensive literature. Modeling approaches include mixture models and product partition models. Here we develop a product partition model and a model selection procedure based on Bayes factors from intrinsic priors. We also find that the choice of the prior on model space is of utmost importance, almost overshadowing the other parts of the clustering problem, and we examine the behavior of posterior odds based on different model space priors. We find, somewhat surprisingly, that the often-used uniform prior (in which all models are given the same prior probability) leads to inconsistent model selection procedures. We examine other priors and find that a new prior, the hierarchical uniform prior, leads to consistent model selection procedures and has other desirable properties. Lastly, we examine our procedures, and competitors, on a range of examples.
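To give a feel for why the prior on model space matters so much, the sketch below treats each model as a partition of n labelled observations into clusters. This is an illustrative assumption and is not taken from the abstract. Under a uniform prior over all partitions, the implied prior on the number of clusters k is S(n, k) / B(n), where S is a Stirling number of the second kind and B is the Bell number. For contrast, the code also computes a simple two-stage prior that first spreads mass evenly over k and then evenly over partitions with k clusters. That two-stage prior is a hypothetical simplification used only to show the hierarchical idea; it is not claimed to be the hierarchical uniform prior defined in the paper. All function names are made up for this example.

```python
from fractions import Fraction
from functools import lru_cache


@lru_cache(maxsize=None)
def stirling2(n, k):
    """Number of ways to split n labelled objects into k non-empty clusters."""
    if n == k:
        return 1
    if k == 0 or k > n:
        return 0
    # Object n either joins one of the k existing clusters or starts a new one.
    return k * stirling2(n - 1, k) + stirling2(n - 1, k - 1)


def bell(n):
    """Total number of partitions of n labelled objects (n >= 1)."""
    return sum(stirling2(n, k) for k in range(1, n + 1))


def implied_prior_on_k_uniform(n):
    """Prior mass on each cluster count k when every partition is equally likely."""
    total = bell(n)
    return {k: Fraction(stirling2(n, k), total) for k in range(1, n + 1)}


def two_stage_partition_prob(n, k):
    """Prior mass of ONE partition with k clusters under the illustrative
    two-stage prior: uniform over k, then uniform over partitions given k.
    (Hypothetical simplification, not the paper's definition.)"""
    return Fraction(1, n * stirling2(n, k))


if __name__ == "__main__":
    n = 10
    for k, p in implied_prior_on_k_uniform(n).items():
        print(f"k={k:2d}  uniform-over-partitions mass on k: {float(p):.4f}  "
              f"two-stage mass on k: {1 / n:.4f}")
```

Running the script for n = 10 shows the uniform-over-partitions prior piling most of its mass on intermediate values of k, while the two-stage variant treats every cluster count equally by construction.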
Related resources
Introducing of Dirichlet process prior in the Nonparametric Bayesian models frame work
Statistical models are used to learn about the mechanism that generates the data. It is often assumed that the random variables y_i, i=1,…,n, are samples from a probability distribution F that belongs to a parametric class of distributions. However, in practice, a parametric model may be inappropriate to describe the data. In this setting, the parametric assumption could be r...
Nature Methods jModelTest 2: more models, new heuristics and parallel computing
jModelTest 2: more models, new heuristics and parallel computing Diego Darriba, Guillermo L. Taboada, Ramón Doallo and David Posada Supplementary Table 1. New features in jModelTest 2 Supplementary Table 2. Model selection accuracy Supplementary Table 3. Mean square errors for model averaged estimates Supplementary Note 1. Hill-climbing hierarchical clustering algorithm Supplementary Note 2. He...
Comparison of Linear and Threshold Models for Estimation Genetic and Phenotypic Parameters of Success of Conception at First Service and Inseminations to Conception in Holstein Cattles in East Azarbayjan Province
In this research, genetic and phenotypic parameters for reproductive traits were estimated using linear and threshold models, with data from 6 large industrial dairy herds of East Azerbaijan province collected by the Agriculture Jihad Organization over 10 years (2001-2010). Best linear unbiased predictions of trait breeding values were estimated using the Restricted Maximum Likelihood method by WOMBAT sof...
The Tail Mean-Variance Model and Extended Efficient Frontier
In portfolio theory, it is well known that the distributions of stock returns often have non-Gaussian characteristics. Therefore, we need non-symmetric distributions for modeling and accurate analysis of actuarial data. For this purpose and for optimal portfolio selection, we use the Tail Mean-Variance (TMV) model, which focuses on rare risks with high losses that usually occur in the tail of r...
Publication date: 2011